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ABSTRACT 

We derive the hnear correction term for needlet and wavelet estimators of the bispectrum and 
the non-linearity parameter /nl on cosmic microwave background radiation data. We show that on 
masked W AI APA^ke data with anisotropic noise, the error bars improve by 10-2 0% and almost reach 
the optimal error bars obtained with the KSW estimator (jKomatsu et al.lT2005[ ). In the limit of full- 
sky and isotropic noise, this term vanishes. We apply needlet and wavelet estimators to the WMAP 
7-year data and obtain our best estimate /nl = 37.5 ± 21.8. 

Subject headings: cosmic microwave background — cosmology: observations — early universe — 
methods: data analysis — methods: statistical 



1. INTRODUCTION 

It is well known that most inflationary models pre- 
dict the fluctuations in the Cosmic Microwave Back- 
ground (CMB) to be close to but not exactly Gaussian. 
Non-Gaussian predictions are strongly model dependent, 
thus making primordial non-Gaussianity (NG) a power- 
ful tool to discri minate among d ifferent Early Univers e 
scenarios (see e.g iBartolo et all (|2004aH : iChen I (|2010f ): 
iLiguori et al. I (|2O10t ) and references therein). 

In this paper we will focus on so called local non- 
Gaussianity, which can be parametrized in the simple 
form: 

$(x) = $i(x) + /ii?r' (*i(x) - (<i>i(x))) , (1) 

where $(x) is the primordial curvature perturbation 
field at the end of inflation and $l(x) is the Gaussian 
part of the perturbation. The dimensionless parame- 
ter f^{f^ describes the amplitude of non-GaussianitjU. 
Local non-Gaussianity is predicted to arise from stan- 
dard single-field slow-roll inflation (jAcauaviva et al.l 
120031: iMaldacena I [2003). although at a very tiny level, 
as well as from multi-field inflationary scenario s, like 
the curvaton f Moillorach 1990; Enavist & Sloth 1 120021 : 
iLyth & Wandsj |2002e iMoroi fc Takahash i I l2001ll or in- 
hom ogcnco us (pre)reh eating models (D vali et al.l 120041 : 
iKolb et al.i 120051 12006D ). Even alte rnatives to inflation, 
such as ekpyrotic and cy clic models (jKovama et al. 2007; 
iBuchbinder et al.ll2008f ) predict a local NG signature. 
The expected non-Gaussian amplitude /nl varies sig- 
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nificantly from model to model. For example, standard 
single-field slow -roll inflation predicts /nt, ~ 10~^ at the 
end of inflation (jAcquaviva et al.ll2003l : lMaldacena1l2003| l 
(and therefore a final value ~ unity after general rela- 
tivistic s econd-order perturba tion effects are taken into 
account (jBartolo et al.|[20"04bl i3)). Such a small value is 
not experimentally detectable and for this reason a detec- 
tion of a primordial non-Gaussian signal in present and 
forthcoming CMB data will rule out single-field slow-roll 
inflation as a viable scenario. Motivated by these consid- 
erations many groups have attempted to measure /nl us- 
ing CMB datasets, and Wilkinson Microwave Anisotropy 
Probe {WMAP) data in particular. 

Consistently with theoretical findings, the most strin- 
gent NG bounds have been obtained using estima- 
tors of the CMB angular bispectrum (namely the 
three-point function of CMB fiuctuations in harmonic 
space). While in the classical approach to /nl es- 
timation |K omatsu ct al. 2005; CrcmincUi et al. 200(1 
lYadav fc Wandelt 1 120081: lYadav et al.ll2007bD the start- 
ing point to build an optimal cubic statistic is a direct 
multipole expansion of the temperature field, alternative 
representations can be u sed as well, like e .g. th e modal 
bispectrum expansion of iFergusson et all ()2009[ ). or bis- 
pectra of wavelet and needlet coefficients. Since all these 
approaches are just based on expanding the same quan- 
tity (the angular bispectrum) in different bases (polyno- 
mial modes, wavelets, needlets etc.), they are also ulti- 
mately expected to yield very similar results when ap- 
plied to data. This is indeed the case. For example, a 
recent estimate usi ng an optim al bispectrum estimator 
has been made by ISmith et al.l (|2009l) . They obtained 
the smallest error bars on /nl to date, finding /nl = 
38 ± 21 on WMAP 5-year data. Consisten t results, al- 
though with larger error bars, were f ound bv lCurto et al.l 
(2009a) and iPietrobon et all (|2009[ ). using parts of the 
bispe ctrum of Spherical M exican Hat Wavelets (SMHW) 
(.Martinez-Gonzalez et al.. .2002,) and the skewne ss of 
needlet coefficients, and bv IFergusson et aP (|2010l ). us- 
ing a modal bispectrum expansion. The most updated 
optimal result has been found by the WMAP team 
on the 7-year data w ith the estimate /nl = 32 ± 21 
(jKomatsu et al.l 1201 ID . Wavelets provide again a very 
similar result of /nl — 30 ± 23 (Fisher matrix bound 
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ap = 22.5) (|Curto et alJl2011bl) . 

At this point one could reasonably ask why it is use- 
ful to implement estimators using many different bispec- 
trum representations. After all in the end they are all 
expected to produce basically the same output in terms 
of /nl- The justification is two- fold. First of all, dif- 
ferent expansions can provide information beyond /nl 
like mode spectra and full bispectrum reconstruction 
Fergusson etall 120091 I2O1O0 V Second of all, different 
expansions can present important practical advantages, 
such as computational rapidity or robustness to a number 
of contaminants and effects (masking, non-stationarity of 
the noise, foreground emission and so on). The WMAP 
7-year results quoted above seem indeed to illustrate the 
latter point well. When we compare the bispectrum 
and the wavelet results we see that central values and 
error bars are both very similar. However, to achieve 
this result the bispectrum estimator needs to include a 
very important linear correctio n term. This additiona l 
term, originally introduced in iCreminelli et al.l ()2006[ ). 
subtracts from the measured three-point function a spu- 
rious /nl contribution due to the breaking of statisti- 
cal isotropy introduced by masking and non-stationary 
noise. Without this contribution the error bars of the 
bispectrum estimator would be much largeiQ than the 
quoted 21 (a factor at least 4 or 5 larger, a s shown for 
example in Fig. 4 of ICreminelli et ahl (|2006f) ). It turns 
out however that despite having an error bar only ~ 10% 
larger than the bispectrum measurement, the WMAP 
7-year wavelet result of / nl = 30.0 ± 23 (F isher ma- 
trix bound ap = 22.5) (|Curto et al.l 1201 IbO was ob- 
tained without including any linear correction. It seems 
then that the wavelet expansion is much less affected 
by masking and anisotropic noise than the bispectrum 
estimator is. This raises two important iss ues, firstly 
pointed out in iFergusson and Shellard I (j201lD . The first 
is why wavelets seem so much more efficient than a stan- 
dard harmonic decomposition in dealing with breaking of 
isotropy in the data. This point was partly addressed in 
(jCurto et a l. 2011c), but we think that no definite and 
conclusive explanation has been provided to date (see 
also paragraph I2.3.3P . The other issue is whether it is 
possible to further reduce also the variance of wavelet- 
based estimators through the introduction of the linear 
correction term. The aim of this work is to address both 
these questions: firstly we explicitly derive the linear cor- 
rection for needlet and wavelet estimators of the bispec- 
trum and of the non-linearity parameter /nl- We show 
how this linear term identically vanishes for full-sky maps 
with isotropic noise; we also explain why this term is in 
general smaller than for harmonic space based estima- 
tors, although non-negligible (Section [2]). We then imple- 
ment the linear term correction on a needlet bispectrum 
estimator and show that these results are confirmed by 
simulations; the procedures are then applied to WMAP 
7-year data (Section [3]). Conclusions are drawn in Sec- 
tion g] 



® Note that the correction is very large for local NG, but much 
smaller for other types of NG, hence the reason to consider only 
local NG in this paper, which is entirely focused on issues related 
to the linear term 



2. MOTIVATION FOR THE LINEAR TERM 
CORRECTION 

2.1. Some background results 

2.1.1. Wick products 

We recall first some well-known background facts, to 
fix notation. Consider Gaussian variables Xi, X2, X3, 
such that (Xi) = 0, (Xf) = {X,Xj) = a^j. The Wick 
product of the three variables is defined as 

: Xi, X2, X3 := X1X2X3 — 0^12X3 — ai3X2 — cr23Xi . (2) 

Example 1 For Xi = X2 = X3 = X 

■.X,X,X X^ - 3a'^X . 

For fT^ = 1 this is the well-known Hermite polynomial of 
order 3, H^iX) ^ X^ - 3X. 

For the expected values of Wick products, the following 
Diagram Formula holds 

(: Xii,Xi2,Xi3 :: ^^21,^^22,^^23 '■} 
= {XnX2i) {X^2^2i) (^13^:23) + 5 permutations (3) 

where the only permutations that are considered 
are those such that in each pair an element 
from the first triple {Xw^Xyii^x'iS is coupled with 
an element from the second triple {A^2i, A^22, ^"23}. 
It is usually convenient to visualize the elements 
{Xii,Xi2,Xi3},{jr2i,-'f22,-'^23} as vcrticcs aligned on 
two different rows, and the pairs as edges connecting two 
different vertices; the above-mentioned diagram formula 
is then usually expressed by stating that "flat edges" are 
ruled out. 

Example 2 For the Hermite polynomial we have 
(iH,iX)f) = ({X'-Sxf)=6 . 

We see that the expected value of the square of the 
third order Wick product is much smaller than the ex- 
pected value of ^(X-^)^^ = 15. In fact, given Gaussian 

random variables with unit variance a standard argu- 
ment can be used to prove that Wick products yield 
the smallest variance among all other polynomials of the 
same order. This result is well-known and can be found 
in any monographs on related subjects, see for instance 
(Marinucci & Pcccati 2011; Pcccati & Taqqu 201l|) for 
two recent references; we provide here a short proof for 
the case of cubic polynomials for the sake of complete- 
ness. Indeed consider Gaussian zero mean, unit vari- 
ance random variables Xi , X2 , ^3 , not necessarily inde- 
pendent, and form a generic polynomial 

P{Xi,X2,X3) := X1X2X3 + C1X2X3 + C2X1X3 
+C3X1X2 + C12X3 + C13X2 + C23X1 + C123 , (4) 

where the c's are arbitrary (fixed) real numbers. We can 
rewrite 

P{Xi, X2, X3) =: Xi, X2, X3 : 
+C1X2X3 + C2X1X3 + C3X1X2 

-f (Ci2 + {XiX2))X3 + (Ci3 + {XiX3))X2 
+ (C23 + {X2X3))Xi + C123, 

=:X^,X2,X3:+Q{Xi,X2,X3) , (5) 



On the linear term correction for needlets/wavelets NG estimators 



3 



where Q{Xi, X^) is a second order polynomial in 
{Xi,X2,X3). Now, it is readily seen that : Xi,X2,X3 : 
is uncorrelated with any polynomial of order 1 or 2 in 
the same variables; for instance 

(: Xi, X2, Xs : X1X2) = 

- {X1X2) {X3X1X2) 
- - {X2X3) {X^X2) - , (6) 

because odd moments of Gaussian variables always van- 
ish. Hence we have 

Var{P{Xi,X2,X3)} 
= Var{:Xi,X2,X3 : +Q{Xi, X2, X3)} 
= Var{:Xi,X2,X3 :} + Var {Q{Xi, X2, X3)} 

+2Cov{:Xi,X2,X3 ■.,Q{Xi,X2,X3)} 
^Var{:Xi,X2,X3 :} + Var {Q{Xi, X2, X3)} 

>Var{:Xi,X2,X3:} , (7) 

whence the result is established. In words, Wick poly- 
nomials of order 3 are uncorrelated by construction with 
any other polynomial of smaller order in the same ran- 
dom variables, and hence minimize the variance in the 
class of cubic polynomials of unit coefficient in the max- 
imal term. This provides the heuristic rationale for the 
introduction of linear correction terms in standard and 
wavelet /needlet bispectrum estimators. 

2.1.2. Needlets, Mexican needlets and SMHW 

Let bit) be a weight function satisfying three condi- 
tions, namely 

• Compact support: h(t) is strictly larger than zero 
only for t G [B~^,B], some B > 1 

• Smoothness: b{t) is C°° 

• Partition of unity: for all £ = 1, 2, ... we have 
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B3 



Recipes to construct a function h{t) that satisfy these 
con ditions are easy to find and are provided for instance 
by ( Marinucci et al.l I2OO80 and (jMarinucci fc Peccati I 

mm . 

Consider now a a grid of points {^jfc} on the sphere, e.g. 
the HEALPisH centres (jGorski et al.ll2005D : the needlet 
system is then defined by 



with the corresponding needlet coefficients provided by 



Pjk = \/'>^jk / J{x)ipjk{x)dx 

' / e \ 

E E b[^ja,„,Ye„,{^jk) ■ (9) 



^'^ http://healpix.jpl.nasa.gov 



The coefficients {Xjk} are proportional t o the pixel 
area, see for instance ()Baldi et al.l l2009bD for more 
detail s. The needlet idea has been extended by 
( Gelle r &: Maveli I [2009allH ) w ith the constru c tion o f so 
called Mexican needlets (see !Scod eller et al.l (j2011[) for 
numerical analysis and implementation in a cosmological 
framework). Loosely speaking, the idea is to replace the 
compactly supported kernel b{£/B^) by a smooth func- 
tion of the form 



Bi J V 



(10) 



for some integer parameter p. Mexican needlets have 
extremely good localization properties in real space, and 
for p — 1 they provide at high frequencies a good approx- 
imation to the so-called Spherical Mexican Hat Wavelet 
(SMHW) construction. The latter is exp l oited for in- 
stance by 'Mar tinez-Gonzalez et al.l (|2002f) ; iCurto et al.l 
(2009a, 2011a.^ to which we refer for more discussion 
and definitions. In short, the SMHW coefficients at lo- 
cation n and scale R are provided by 

w{n,R)^ f{x)^{x,n]R)dx , (11) 

where the wavelet filter is defined as 

1 1 /l,\2 



*(x,n;i?) 



2^N{R) 



[l+(f) ][2 



(12) 

here, N{R) = R^/TVWj2TW/l is a normalizing con- 
stant and y — 2tan^?/2 represents the distance between 
x and n, evaluated on the stereographic projection on 
the tangent plane at n; 9 is the corresponding angular 
distance, evaluated on the spherical surface. 

2.2. The linear correction term for the KSW estimator 

For our arguments to follow, and to allow for a proper 
comparison with the results for needlet/wavelet based 
estimators below, we shall provide a brief heuristic argu- 
ment to motivate the need for a linear ter m correction 
(iCreminelh et al.ll2006l: lYadav et al]|2007af) in the well- 
known KSW bispectrum estimator for the /nl param- 
eter In particular, let us consider any three frequencies 
^1,^2, -^3 satisfying standard triangular conditions, and 
consider the angle-averaged bispectrum 



Bp 



(2£i-f 1)(2^2 + 1)(24 + 1) / ^1 £2 £: 
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Z-^ \ mi 7712 7723 

mim2m3 





^iim-i ^£21^2 ^^sTn^ 



^ ^ ^^?imi ^^2^^2 ^^3^3 

mim2m3 

X Ye^mi {x)Yl^rn2 {x)Yl^m3 {x)dx 

Ti^{x)Tt^{x)Tt^{x)dx , 



/S2 

Ti{x) = y^^ahnYlira{x) 



(13) 



where we have chosen a convenient (albeit non-standard) 
normalization for the bispectrum to make our argument 



Donzelli et al. 



notationally simpler - these normalizations do not affect 
by any means the substance of the argument. Now, the 
bispectrum should be more properly written as 



:i^2^3= / {■Te^ix),Ti^ix),Ti^{x) :}dx 
{Ti^ {x)Ti^ {x)Tt^ {x)} dx 
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{V i^i^{x)Ti^{x) 

+ Ti^i^Tt^{x) +Vt^t^Ti^{x)}dx , 
r,„,„ (x) = (T,„ {x)Te^ (x)) ,u,v^ 1,2,3, (14) 

In the presence of full sky-maps with isotropic noise we 
have that Ti^i^ (x) = Tg^i^ , i.e. it would be constant over 
pixels, whence for instance 



{Ti^i^{x)Ti^{x)}dx = 
{Tt^{x)}dx = 



S2 



{Ti^i^Ti^{x)}dx 







(15) 



because 



S2 



aimYim{x)dx 
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Yem{x)dx = 



(16) 

(assuming the monopole is zero). On the other hand, in 
the presence of anisotropic noise and / or masked maps the 
previous argument cannot hold, whence the linear term 
does not cancel and the variance of the Wick product 
(including the Wick product) is systematically smaller 
than any other cubic statistic; for instance, as before 




T!{x)dx 



{S^\M)x(S^\M) 



{Te{x)Ti{y)f dxdy 



(17) 



(Tlix)) {Tf{v)) {Te{x)Te{y)) dxdy 



(S2\M)x(S2\M) 



Here, we have used S'^\M to denote integration over the 
sphere S minus the masked region M. In the full-sky 
case M = %, with isotropic noise, we have 
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(18) 



whence the previous expression becomes 
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X / P/(x • y)dxdy 
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2^-1-1 
47r 



{c; 



CMB 



cv 



(S2\M)x(S2\M) 



Pt{x ■ y)dxdy 



2^+1 
47r 



{C\ 



CMB 



000 



(19) 



In the previous computations, we have written Pi for 
Legendre polynomials, x ■ y for scalar products, we have 
used the well-known Wigner's 3j symbol arising from the 
(Gaunt) integral of the third power of Pi, and we ex- 
ploited the well-known fact that 



5^x52 



P£{x ■ y)dxdy = 



(20) 



whence the so-called flat edges terms vanish. This is not 
so in general, though. Note, however, that 




[T!ix)~3{T!ix))Te{x)]dx 



S'^\MxS^\M 



{Te{x)Te{y))^ dxdy 



(21) 



so that the flat terms are cancelled, even in the presence 
of anisotropic noise or masked regions. 

2.3. Needlets /wavelets Non-Gaussianity estimators 

The situation for wavelet or needlet/like non- 
Gaussianity estimators is to some extent analogous to 
the one for the K SW procedure. For instance, in 
(|Curto et all 1201 lal ). Eqs. (14-16), the variance of the 
following statistic is considered 
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X / w{Ri,ni)w{Rj,ni)w{Rk,ni)dni, (22) 



47r tJiCTjCrfe Jg 

where each w{Ri,ni) is the (random) SMHW coefficient 
at scale Ri and location ni (Eq. (|lip ): this variance can 
clearly be written as 

1 1 1 



(47r)2 aiGjOk (TrfysOt 

w{Ri,ni)w{Rj,ni)w{Rk,ni) 



w{Rr,n2)w{Rs,n2)w{Rt,n2))dnidn2 



(23) 



For full-sky maps with isotropic noise, we would have 
that 

{w{Ri,ni)w{Rj,ni)w{Rk,ni)w{Rr,n2)w{Rs,n2)w{Rt,n2)} 
{w{Ri,ni)w{Rr,n2)) {w{Rj,ni)w{Rs,n2)) 
X {w{Rk,ni)w{Rt,n2)) +5 permutations, (24) 

e.g., the only terms that non- vanish are those where 
ni and n2 appear in the same pair. However, as dis- 
cussed above, in the presence of masked regions and/or 
anisotropic noise the terms with ni,ni or 71.2,71.2 do not 
vanish, and we should rather write 

{w{Ri, ni)w{Rj , ni)w{Rk, ni)w{Rr, n2)w{Rs,n2)w{Rt,n2)) 
= {w{Ri,ni)w{Rr,n2)) {w{Rj , ni)w{Rs, n2)) 
X {w{Rk,nx)w{Rt,n2)) + 14 permutations, (25) 

i.e., there are 9 further "flat" permutations (those where 
two terms from the same row are coupled - there are three 
different way for each row to do this). These missing 
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terms give to the integral a contribution of the form 
X {w{Rj,n2)w{Rs,n2)) {w{Rk,ni)w{Rt,n2)) dnidn2 

= {4ly^ aj,a, aJ^aJ s-xS'^ {w{R„ ni)w{Rr , Ui)) 

X {w{Rj,n2)w{Rs,n2)) {w{Rk,ni) 

X w{Rt, 712)) dnidn2 (26) 

Now, for isotropic noise, {w{Ri,ni)w{Rr,ni)) ~ const, 
whence the previous quantity is approximately propor- 
tional to 

/52XS2 {w{Rk,ni)w{Rt,n2)) dnidn2 
= (/s2 w{Rk,ni)dni x /^^ w{Rt,n2)dn2) , (27) 



because 



^^{Rk, ni)dni ~ 



(28) 



in the absence of masked regions. So when the celestial 
sphere is fully observed, it is equivalent to consider or 
not the extra, "flat" terms with the same indexes rti,n2 
in the pairs. 

As stated earlier, and as for the KSW estimators, these 
terms are no longer identically zero in the presence of 
masked regions or anisotropic noise. A linear correction 
term can therefore be needed; we discuss its derivation 
in the subsection below. 

2.3.1. The linear correction term 

According to our previous argument, it is straightfor- 
ward to see how, to decrease the variance of the wavelet 
cubic statistic, it is enough to change the cubic statistic 
from 



w{Ri,n)w{Rj,n)w{Rk,n)dn (29) 



to 



(: w{Ri,n), w{Rj,n), w{Rk, n) :) dn (30) 



i.e. subtract a linear term of the form 

w{Ri, n)w{Rj, n)w{Rk, n)dn 



{w{Ri, n)w{Rj, n)) w{Rk, n)dn 

{■w{Ri, n)w{Rk,n)) w{Rj, n)dn 

{w{Rk,n)w{Rj,n)) w{Ri,n)dn . (31) 

A similar situation exists for the needlets bis pectrum 
(jLan fc Marinucci I l2008at iRudiord et all l2009f ). which 
we can implement as 

- ^^^3 (fc)/3j2fc - ^3^33 i.^)Phkl } : (32) 



where fijk are the usual coefficients for (standard or Mex- 
ican) needlets and 



Of course, under isotropic noise 



(33) 



^^1^2 (^) 



B32 



2e + i 

An 



(34) 

Note that the linear terms have expected value zero al- 
ways 

(r,u2 {k)f3,3k) = r,u. (fc) (ft3fc3) = ; (35) 

however the observed value of these terms over one re- 
alization of the sky is exactly equal to zero only if the 
sum (or the integral) is taken over the whole sphere and 
the noise is isotropic (assuming again zero monopole), 
i.e. for SMHW 



{w{Ri , n)w{Rj , n)) w{Rk , n)dn 



= {w(R,,n)w[Rj,n)) I w{Rk,n)dn^O, (36) 
and correspondigly for the needlets 

k3 ki 

i.e. when there are no masked regions and the noise 
is isotropic. These are the assumptions under which 
the behavior of the ncedlet bispectrum was investigated 
by (|Lan fc Marinucci 2008a), where it was firstly intro- 
duced in the statistical literature. 

It should be noted, moreover, that in practical situations 
the contribution of the linear term for wavelet /needlet- 
like bispectrum estimators will be typically smaller than 
for KSW. This can be explained as follows: consider 



k k Im 

= Er...2(fc)E^(^) Lj(-)'^ 
- / ^3132 (y) E ^ 



{x)Y lm{x)Ytm{ij3k)dx 



T{x)Pe{x ■ y)dxdy 



(3i 



Now it is a consequence of needlet concentration in pixel 
space that 

r,u,(y) b (^^^ J^^ T{x)PfXx ■ y)dxdy 



T{x) 



dx 



I n^) I r,,,2(y)^fe 



Pi{x ■ y)dy 



dx 



(39) 

where N^{x) is a small neighborhood of x\ now assuming 
that noise is approximately constant over B^(x), we can 
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write 



because by localization 



£ 



/ ^3132 (y) ^ 

JBAx) I 



Pi(x ■ y)dy 
Pe{x ■ y)dy 



(40) 



Pi{x ■ y)dy 



(41) 



2.3.2. The relationship with mean subtraction 

We shall now show how, in the presence of nearly 
isotropic noise, the behaviour of the linear term is well- 
approximated by subtracting scale-by-scale the sky aver- 
age of wavelets or needlets coefficients. Indeed, define 



(42) 



then 



^ Y.^Pnk - P,M2k - P,M,k - /3,3) 

k 

k V k , 



(43) 



Now for nearly isotropic noise, e.g., if the covariance 
among coefficients 



riii2(fc) = WnkPnk) 



- 3132 



(44) 



is nearly constant over the sky, then, for high enough j, 
it is natural to expect that the cross-correlation among 
wavelet coefficients evaluated on a sky realization will be 
close to the ensemble average. 



(45) 



(46) 



Moreover we also expect 
whence 

\ E f^3lkl3j2k \ ^ '^jjj2j3 - rj2j3^ E ^J'^ ' 

(47) 

and similarly for the permutation terms, whence the lin- 
ear term will be well-approximated by mean subtraction. 
At smaller frequencies j, this argument will not work, 
but at these scales noise is likely to be negligible. In gen- 
eral, it should be noted that the approximation will work 



better in cases where noise anisotropy is not extremely 
relevant, and less well otherwise. Also, for higher order 
polyspectra or alternative statistics (such as KSW) the 
equivalence between mean subtraction and linear term 
correction will no longer hold. 



2.3.3. A toy counterexample 



In iCurto et al.l ()2011c[ ). Section 3.2 it is claimed that 
the linear term is of order /^^ ^-i^d hence negligible. This 
is motivated on the basis of the asymptotic uncorrelation 
of the wavelet coefficients, implying the validity of the 
Central Limit Theorem (CLT) for the cubic (bispectrum 
statistic). 

We do agree on the uncorrelation of the coefficients and 
the Central Limit T h eorem taking place: see fo r instance 
(|Baldi et al.ll2009aD . (|Lan fc Marinucci |[2008allbl ) for an- 
alytic arguments in the needlets and Mexican needlets 
case. We fail to see, however, why this should necessarily 
imply that the resulting linear term should be negligible. 

As a toy counterexample, consider a sequence of inde- 
pendent Gaussian random variables Xi, with zero mean 
{Xi) and nonconstant (e.g. anisotropic) variance (Xf) — 
af; to fix ideas, we shall take af = 1 for i = 1, 3, 5, (i.e. 
odd) af = 3 for z = 2,4,..., (i.e. even). With these 
assumptions, we try to mimic the behavior of nearly in- 
dependent wavelet coefficients with anisotropic noise; for 
simplicity, we neglect the effect of masked regions, but 
the argument would be analogous. Now consider the 
statistic 



B 



-3 



1 

^"^v^E^' 
^ ?— 1 

_^ n 1 ^ 

B2n = —;= y [Xi ~ Xn) , Xn = — > Xi , 

Jn ^-^ n ^-^ 

^ n ^ n 

Bsn = —p= 2^ (: Xi, Xi,Xi :) = —= \ (Xf - SafXi 



which correspond, in these circumstances, to the naive 
cubic statistics/bispectrum (without linear term), the 
cubic statistic with mean subtraction, and the proper 
bispectrum with Wick polynomials/linear term subtrac- 
tion. Because the Xi are exactly independent, it is triv- 
ial to see that the CLT holds, and all three statistics are 
asymptotically Gaussian. Nevertheless, it is readily seen 
that 



Var{B„.} = ^i:i'«rW) = ii:{X«> 

1=1 



t6 I ^6 



= 210 



because (A"f) — IStif by Wick's Theorem / Diagram 
Formula Eq. ([3]) (which in this case must include the 
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so-called flat edges). On the other hand 
1 " 

Var {B3„} ^-Y^Var {xf - iajX, ] 

r) ^ — ^ 



The difference between the variance with mean subtrac- 
tion and the linear term is given by 

Var{B2n}-Var{B3n}^ 



1 " 

1=1 

n 

1=1 



= 84 



We see thus that the linear terms is indeed not negligi- 
ble {Var{3afX,} = 9af, as compared to Var{Xf} = 
15af), and, in view of its negative correlation with the cu- 
bic statistics, it induces a major decrease in the variance. 
Concerning mean subtraction, simple computations show 
that 

1 " 
V 1=1 

V i=i V 1=1 yj=i J 

3 " 1 " 

^ Z=l ^2=1 

Now the third and fourth term are easily seen to be con- 
verge to zero, from the law of large numbers. For the 
second summand, switching sums we obtain 



3 1 

n 



where, again by the law of large numbers we have the 
convergence (with probability one) 




Hence, neglecting terms of order n ^ we have 

1 ■J^ / 3 af + a2 \ ^ 
B2n^^}:^[X,-3^^X, ; 

^ i=l ^ ^ 

Using the Diagram Formula of Eq. ([3]) again, after some 
manipulations one obtains 



Var{B2n} ^9(Ti 



2 erf + (7| + 2c7la^ 



2. o 



33crf -f 33crf - 9cr?cr| - 9cr|crf 



SO that, for crj = 1, cr| = 3 we have 

33-1-33 x 27-81-27 
Var{B2n] z = 



■^4/2 2\ I 4/ 2 2\ 

-gO-l(o-2 - crj + -cr2(o-2 -f^l) 

Remark 3 Clearly the approximation of the linear term 
improves when the anisotropy decreases; in fact, the dif- 
ference Var {B2n} — Var {B^n} —J' as (t| —> af. For 
instance, taking cr^ = 1, ct| = 2 we have 

Var{B,.}^l6ld±^]^'-^^67.5 



Var{B3n}^6\- 



: 27 



q 27 

Var {B2n} - Var {B3,,} = -{aj ~ G\f{ul + ul) = - , 

whence 

Var {B2n} 6 I ^L±^ | = 30.375 . 

3. APPLICATION TO WMAP DATA 

3.1. The data 

In order to test the effect of the linear term correction 
we applied the needlet and SMHW estimators to the fore- 
ground reduced V + W maps of the WMAP 7- year data. 
For simplicity we coadded the two frequency band maps 
with a constant noise weight. We analyzed the maps 
at HEALPix resolution Ngide = 512 and masking out 
galactic foregrounds and point sources with t he extended 
temp erature analysis mask, known as KQ75 (|Gold et al.l 
|2011| ). Where Gaussian simulations are necessary we 
used the parameters from the WMAPl +BAO+Hn cos- 
mological data to simulate the CMB sky (|Komatsu et al.l 
12011). then applying the beam and noise properties sup- 
plied by the WMAP team. 

3.2. The needlets/wavelets /nl estimator with the 
linear term 

The /nl estimator based on the needlets bispectrum 
has been deve l oped and a pplied to WMAP data in in 
iRudiord et all (pOOOllMol) . We recall the main features 
of this estimator. The needlets bispectrum can be ex- 
pressed as 



'-313233 



where /3jk is the needlet coefficient at scale j and di- 
rection, i.e. pixel, k, defined in Eq. 1^, and ajk is the 
expected standard deviation of f3jk. The needlets bis- 
pectrum is used to estimate /nl by a minimization 
procedure: 

X'(/NL) = C1^(/NL)C-Id(/NL) 

where the data vector is 



(49) 



johs 



(50) 
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Here 1°^^ is the bispectruni of the observed data and 
{Ijij^ji) is the average first-order non-Gaussian bis- 
pectrum obtained from non-Gaussian simulations. In 
this analysis we used simulations with local-type non- 
Gaussianity generated with the algorithm described in 
lElsner and Wandelt I (|2009l ). The covariance matrix C 
is obtained by means of Monte Carlo simulations: 



a 



{didj) - {di){dj 



Differentiating Eq. (HH]) yields the estimate 



/nl — 



/f- - - \Tp-lrobs 

i^'jlhis)^^ ^{-^jlj2j3) 



(51) 



(52) 



Details of the estimation procedure can be found in 
IRudiord et all (2009). 

We want now to adopt the linear term correction in 
order to decrease the variance of the estimator (|52p . Fol- 
lowing Eq. ([5^ . it is straightforward to subtract the lin- 
ear term from the bispectrum in Eq. (j48p . The needlet 
bispectrum Ij^j^j.^ in the data vector (|5n|) can be ex- 
pressed now as 



robs 

jihh 



1 



- 3133 



k 

{k)lij2k - rjy3(fc)^j,fej, (53) 



where 



(54) 



and similarly for the SMHW where the fijk are replaced 
by the w{R,nk)- We note that the contribution of the 

linear term to the average (/jijaja) the data vector 
(j50p is vanishing, as the expected linear term value is 
always zero (Eq. ([55)) ). The terms ^jij2{k) contains one 
contribution from the CMB and one from the noise. The 
former is calculated using Monte Carlo simulations, the 
latter is calculated analytically. 

We implemented slightly different algorithms of the 
/nl estimator described above. The cubic statistic in 
Eqs. PSI [551) has been obtained in three cases, namely 
with standard needlets, Mexican needlets and SMHW 
coefficients. 

3.3. Analysis of the WMAP data 

In order to test the effect of the linear term correc- 
tion, we applied the /nl estimator to the WMAP data 
described in Section 13.11 both before and after the ad- 
dition of the linear term. The standard needlets co- 
eflacients has been obtained with two different bases: 
B = 1.781 with scales j = 1 - 13 and B = 1.34 with 
j = 3 — 22. For the Mexican needlets we chose B = 1.34 
and p = 1 with j = 3 — 22. In all the cases in ad- 
dition to the needlet scales we considered a "scaleO", 
i.e. the original map not convolved with the needlets. 
The sc ales selected f or the SMHW are the same scales 
used in lCurto et al.l (|2011al fB): Rq = (the uncolvolved 
map), Ri = 2.9, i?2 = 4.5, i?3 = 6.9, i?4 = 10.6, i?5 = 
16.3, i?6 = 24.9, i?7 = 38.3, i?8 = 58.7, i?9 = 90.1, i?io = 
138.3, iiii = 212.3, i?i2 = 325.8, i?i3 = 500.,i?i4 = 767.3 
arc minutes. The SMHW coefficients have been analyzed 
in two ways: as they are or after subtracting the scale- 
by-scale average of the coefficients outside the applied 



mask. Thereafter we tested the same "mean subtraction" 
procedure with standard and Mexican needlets. All the 
analysis are performed up to the multipole £ = 1500. 
With the exception of one case - where we analyzed the 



TABLE 1 

Masked fraction of the sky with the extended KQ75 masks 



SMHW 



Mexican needlets 



OljCllC -I- hi 


/o 




/o 


(map) 




1 
i 


DD.Z 


i 


OO A 


z 


DU.y 


9 




Q 
O 


oo.o 


Q 

o 


90 A 


A 


Ol .u 


A 
■ri 


90 A 

zy 


r: 



Q 


K 



90 A 

zy 


D 


^j_.o 


6 


29.4 


7 


39.1 


7 


29.6 


8 


36.9 


8 


30.6 


9 


35.3 


9 


33.3 


10 


34.2 


10 


37.8 


11 


33.4 


11 


44.5 


12 


32.7 


12 


54.0 


13 


32.3 


13 


67.4 


14 


32.0 


14 


83.4 


15 


31.6 






16 


31.3 






17 


31.0 






18 


30.7 






19 


30.4 






20 


30.2 






21 


30.1 






22 


30.0 






scaleO 


29.4 



full-sky with standard needlets B — 1.781 - we always 
applied the KQ75 mask. For the SMHW analysis it is 
necessary to properly extend the mask at each scale, be- 
cause the pixels near the border are affected by the zero 
values of the cut. Given the similarity with the SMHW, 
for comparison we extended the mask also for one case 
with the Mexican needlets. Det ails of the mask exten - 
sion procedure can be found in iMcEwen et"all ()2005D . 
The fractions of masked sky obtained with the extended 
masks at each scale are showed in Table [1] 
For each analyzed case we simulated a set of 51000 Gaus- 
sian maps with the same beam and noise properties of 
the WMAP coadded V + W cleaned map. 40800 sim- 
ulations have been used to obtain the covariance matrix 
in Eq. (j5ip . The standard deviation of the remaining 
10200 simulations gives the error bar associated with the 
estimated /nl values. Moreover, in order to verify that 
the /nl estimator is unbiased, for each case we also an- 
alyzed a set of 700 non-Gaussian simulati ons with an 
input /nl = 30 (jElsner and Wandelt Il2009l ). All the re- 
sults are reported in Table [H 



3.4. Results 

The addition of the linear term achieves a decrease 
in the standard deviation in all the cases. The full- 
sky analysis shows only a little improvement of the er- 
ror bar - 0.5% - indicating that the mask gives the ma- 
jor contribution to the correction operated by the lin- 
ear term. Applying the KQ75 mask, all the /nl val- 
ues estimated on the V + W foreground reduced maps 



On the linear term correction for needlets/wavelets NG estimators 



9 



TABLE 2 

Results 



case 


linear 


WMAP 


Error 


Act 


NG sims. 




term 




Bar 




(/nl> ^ 


standard needlets: 


B = 1.781 


no 


II 


18.4 




29.2 


i LlllaKy 


yes 


II 


18.3 


KJ.O 


29 2 


B = 1.781 


no 


63.5 


25.4 




29.9 


KQ75 


yes 


41.5 


22.2 


12.6 


30.0 


ID — ±.0^ 


no 


43.6 


24.9 




zy.o 


iW^ i 


yes 


39.3 


22.1 


119 
1 ± .Z 


9Q ^ 

zy.o 




Mexican needlets: 






Lj — ±.044, — ± 


no 


37.8 


25.4 




9Q ^ 


exL. iV*^ ( 


yes 


26.6 


22.2 




oU.o 


B = 1.34, p = 1 


no 


39.2 


23.2 




29.9 


KQ75 


yes 


37.5 


21.8 


6.0 


29.8 






SMHW: 








no mean subtr. 


no 


77.8 


27.3 




29.7 


ext. KQ75 


yes 


33.1 


21.9 


20.1 


30.1 


mean subtraction 


no 


37.5 


22.3 


18.3"= 


29.5 


ext. KQ75 


yes 


34.4 


22.0 


1.3 


29.7 




needlets 


- mean subtraction: 






standard 












B = 1.34 


no 


33.0 


22.5 


9.6= 


29.5 


KQ75 


yes 


37.1 


22.3 


0.8 


29.5 


Mexican 












B = 1.34,p = 1 


no 


33.2 


22.1 


4.7= 


30.0 


KQ75 


yes 


37.6 


22.0 


0.4 


29.8 



local /nl on coadded V + W foreground reduced maps; 
standard deviation over 10200 Gaussian simulations; 
t'^) [(T(no linear term) - (T(with linear term)]/(T(no linear term); 
average over 700 simulations with input /nl = 30; 
comparison with the "no mean subtracted" <t. 
See the text for further details. 



are consistent wit h the WMAP 7-yea r best estimate of 
/nl = 32 ± 21 fjKomatsu et al.ll2011[ ). We began the 
analysis choos i ng the same needlet base B — 1.781 as in 
iRudiord et al.l (|201ClD . With respect to the previous re- 
sult - /nl = 73 ± 31 - we already obtained a 18% smaller 
error bar analyzing the 7-year data in place of the 5-year 
release and using more scales. But a further improvement 
of 12.6% is given by the linear term, demonstrating that 
the correction is non-negligible. The average (/nl) over 
the 700 non-Gaussian simulations shows that the estima- 
tor is unbiased. We can note also that the /nl value esti- 
mated with the correction is closer to the WMAP result 
than without the linear term. We then chose a smaller 
needlet base B — 1.34, and therefore we constructed the 
cubic statistic of Eqs. [55]) with more needlet scales. 
We indeed note a smaller standard deviation of 24.9 even 
without the linear correction with respect to i? = 1.781. 
The error bar is decreased to 22.1, i.e. another 11.2%, 
adding the linear term. 

Moving to the Mexican needlets case, we started with 
a conservative analysis applying an extended KQ75 mask 
as for the SMHW case. Despite the reduced sky coverage 
(Table [ij we found the same la error bars obtained with 
standard needlets - B = 1.781 (but /nl values closer to 



WMA P). Motivated by the resuUs of iScodeller et al.l 
()201lD that has shown negligible neighbour bias effects 
of the mask, we repeated the Mexican needlets analysis 
with the original KQ75 mask. Looking at the results on 
the non-Gaussian simulations, we notice that the (/nl) 
are indeed unbiased. In this case the linear term correc- 
tion leads to a 6% improvement of the error bar, sufficient 
to achieve our best estimate /nl = 37.5 ± 21.8. This 
standard deviation is very close to the WMAP result 
with the optimal KSW estimator, where a = 21. With 
the addition of the linear term correction the needlets 
/nl estimator is almost optimal. We believe that further 
standard deviation reductions can be achieved with an 
optimal V + W coadding and with a different choice of 
the needlet base B and scales j. Anyway these exploita- 
tions are behind the primary scope of this work. 
Furthermore we consider the case with the SMHW coeffi- 
cients. The extensions of t he KQ75 mask return s sky cov- 
erage percentages close to iCurto et al.l (|2011al lH) . With- 
out subtracting the scale-by-scale coefficients average, 
the linear term correction leads to a reduction of the error 
bar from 27.3 to 21.9, equal to 20.1%. With a very similar 
SMHW ana. l ysis on coadded V + W 7-yr cleaned maps, 
ICurto et all (l2011bD found /nl = 32.5 ± 23 (Fisher ma- 
trix bound ap = 22.5), where here we considered the re- 
sult uncorrected from point-source contribution. There- 
fore the linear term ad dition rather improve s their result. 
However the analysis of lCurto et al.l (|2011b[ ) is performed 
after the scale-by-scale mean subtraction. Following the 
same procedure, we indeed achieved a reduction of the 
error bar without the linear term correction: from 27.3 
to 22.3, equal to 18.3%. The a f ound subtracting th e 
mean is indeed closer to the one of lCurto et al.l ()2011b[ ) . 
Anyway we note that also in this case the addition of 
the linear term can still slightly improve the error bar 
(1.3%), and the a found with this correction - subtract- 
ing or not subtracting the mean - is almost identical. 
Motivated by the SMHW analysis, we tested the scale- 
by-scale mean subtraction procedure with both the stan- 
dard and the Mexican needlets. We found that also in 
the needlets case this procedure well approximates the 
linear term correction, providing an error bar reduction 
of 9.6% (i.e. from 24.9 to 22.5) and 4.7% (i.e. from 
23.2 to 22.1) for standard and Mexican needlets respec- 
tively. But again the addition of linear term provides 
even smaller error bars: 0.8% and 0.4% respectively. 
Moreover we obtained the smallest a with the linear term 
but without subtracting the mean. We indeed got 22.1 
against 22.3 for the standard needlets, and 21.8 against 
22.0 for the Mexican needlets. 

4. CONCLUSIONS 

In this paper, we have derived for the first time the 
linear correction term for wavelets/needlet /nl estima- 
tors. As expected, under ideal experimental circum- 
stances (isotropic noise, no masked regions) this term 
turns out to be identically zero. Under a realistic ex- 
perimental set-up, the term is non-negligible, although 
smaller than for the KSW estimators; this is due to 
the localization properties of wavelets/needlets statistics, 
which soften to some extent the effects of unobserved re- 
gions and anisotropic noise. We have also argued that the 
linear correction term is well- approximated by scale-by- 
scale mean subtraction, thus providing an explanations 
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for recent results from ()Curto et al.ll2011al lH). where nu- 
merical estimates on the variance of wavelet estimators 
where shown to be very close to the KSW bound; in 
fact, we have shown that mean subtraction can cover 
approximately 85-90% of the linear term effect under 
WMAP-\ike circumstances. The procedures we advo- 
cate are applied to WMAP 7-ycar V + W foreground 
cleaned data, confirming that the corrections achieved 
are non- negligible. Applying the linear term correction 
we obtained the best estimate /nl = 37.5 ± 21.8. The 
error bar is very close to the optimal standard devia- 
tion a = 21 found by the WMAP team (jKomatsu et al.l 

[mil). 

In view of these results, we argue that wavelets/needlets 
statistics can provide a statistically sound and computa- 



tionally convenient technique for non-Gaussianity analy- 
sis on CMB data. 
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